software quality
RefAgent: A Multi-agent LLM-based Framework for Automatic Software Refactoring
Oueslati, Khouloud, Lamothe, Maxime, Khomh, Foutse
Large Language Models (LLMs) have substantially influenced various software engineering tasks. Indeed, in the case of software refactoring, traditional LLMs have shown the ability to reduce development time and enhance code quality. However, these LLMs often rely on static, detailed instructions for specific tasks. In contrast, LLM-based agents can dynamically adapt to evolving contexts and autonomously make decisions by interacting with software tools and executing workflows. In this paper, we explore the potential of LLM-based agents in supporting refactoring activities. Specifically, we introduce RefAgent, a multi-agent LLM-based framework for end-to-end software refactoring. RefAgent consists of specialized agents responsible for planning, executing, testing, and iteratively refining refactorings using self-reflection and tool-calling capabilities. We evaluate RefAgent on eight open-source Java projects, comparing its effectiveness against a single-agent approach, a search-based refactoring tool, and historical developer refactorings. Our assessment focuses on: (1) the impact of generated refactorings on software quality, (2) the ability to identify refactoring opportunities, and (3) the contribution of each LLM agent through an ablation study. Our results show that RefAgent achieves a median unit test pass rate of 90%, reduces code smells by a median of 52.5%, and improves key quality attributes (e.g., reusability) by a median of 8.6%. Additionally, it closely aligns with developer refactorings and the search-based tool in identifying refactoring opportunities, attaining a median F1-score of 79.15% and 72.7%, respectively. Compared to single-agent approaches, RefAgent improves the median unit test pass rate by 64.7% and the median compilation success rate by 40.1%. These findings highlight the promise of multi-agent architectures in advancing automated software refactoring.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > Canada > Quebec > Montreal (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.94)
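The plan, execute, test, and refine cycle described in the RefAgent abstract can be sketched as a small control loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `refine_loop`, `plan`, `apply_refactoring`, and `run_tests` are hypothetical, and the agents here are plain deterministic functions standing in for LLM-backed agents and real build tools.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    code: str       # candidate produced in this round
    passed: bool    # whether the test step accepted it
    feedback: str   # error text handed back to the next round


def refine_loop(
    source: str,
    plan: Callable[[str], List[str]],
    apply_refactoring: Callable[[str, List[str], str], str],
    run_tests: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Tuple[str, List[Attempt]]:
    """Plan once, then apply, test, and retry with feedback until tests pass."""
    steps = plan(source)
    feedback = ""
    history: List[Attempt] = []
    for _ in range(max_rounds):
        candidate = apply_refactoring(source, steps, feedback)
        passed, feedback = run_tests(candidate)
        history.append(Attempt(candidate, passed, feedback))
        if passed:
            return candidate, history
    # Nothing passed: keep the original code rather than a broken refactoring.
    return source, history


# --- Hypothetical stand-ins for the agents (assumptions, not RefAgent's API) ---
def plan(source: str) -> List[str]:
    return ["rename variable tmp to total"]


def apply_refactoring(source: str, steps: List[str], feedback: str) -> str:
    # First try renames only one occurrence; a retry with feedback renames all.
    if not feedback:
        return source.replace("tmp", "total", 1)
    return source.replace("tmp", "total")


def run_tests(candidate: str) -> Tuple[bool, str]:
    env = {"a": 1, "b": 2}
    try:
        exec(candidate, env)
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    ok = env.get("result") == 6
    return ok, "" if ok else "unexpected result"


original = "tmp = a + b\nresult = tmp * 2"
refactored, history = refine_loop(original, plan, apply_refactoring, run_tests)
```

In this toy run the first candidate fails with a `NameError`, the error text is fed back, and the second candidate passes. The design choice worth noting is the fallback: if no round passes, the loop returns the untouched source.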
Navigating the growing field of research on AI for software testing -- the taxonomy for AI-augmented software testing and an ontology-driven literature survey
In industry, software testing is the primary method to verify and validate the functionality, performance, security, usability, and so on, of software-based systems. Test automation has gained increasing attention in industry over the last decade, following decades of intense research into test automation and model-based testing. However, designing, developing, maintaining and evolving test automation is a considerable effort. Meanwhile, AI's breakthroughs in many engineering fields are opening up new perspectives for software testing, for both manual and automated testing. This paper reviews recent research on AI augmentation in software test automation, from no automation to full automation. It also discusses new forms of testing made possible by AI. Based on this, the newly developed taxonomy, ai4st, is presented and used to classify recent research and identify open research questions.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
- South America > Brazil > Bahia > Salvador (0.04)
- Overview (1.00)
- Research Report > New Finding (0.54)
QualiTagger: Automating software quality detection in issue trackers
Shivashankar, Karthik, Capilla, Rafael, Kruke, Maren Maritsdatter, Orucevic, Mili, Martini, Antonio
A system's quality is a major concern for development teams as it evolves. Understanding the effects of a loss of quality in the codebase is crucial to avoid side effects like the appearance of technical debt. Although the identification of these qualities in software requirements described in natural language has been investigated, most of the results are often not applicable in practice and have been validated only on small datasets and a limited number of projects. For many years, machine learning (ML) techniques have proven to be a valid way to identify and tag terms described in natural language. To advance previous work, in this research we use cutting-edge models like Transformers, together with a vast dataset mined and curated from GitHub, to identify what text is usually associated with different quality properties. We also study the distribution of such qualities in issue trackers from openly accessible software repositories, and we evaluate our approach both with students from a software engineering course and by applying it to recognize security labels in industry.
- Europe > Norway > Eastern Norway > Oslo (0.04)
- South America (0.04)
- North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
The Enhancement of Software Delivery Performance through Enterprise DevSecOps and Generative Artificial Intelligence in Chinese Technology Firms
This study investigates the impact of integrating DevSecOps and Generative Artificial Intelligence (GAI) on software delivery performance within technology firms. Utilizing a qualitative research methodology, the research involved semi-structured interviews with industry practitioners and analysis of case studies from organizations that have successfully implemented these methodologies. The findings reveal significant enhancements in research and development (R&D) efficiency, improved source code management, and heightened software quality and security. The integration of GAI facilitated automation of coding tasks and predictive analytics, while DevSecOps ensured that security measures were embedded throughout the development lifecycle. Despite the promising results, the study identifies gaps related to the generalizability of the findings due to the limited sample size and the qualitative nature of the research. This paper contributes valuable insights into the practical implementation of DevSecOps and GAI, highlighting their potential to transform software delivery processes in technology firms. Future research directions include quantitative assessments of the impact on specific business outcomes and comparative studies across different industries.
TestLab: An Intelligent Automated Software Testing Framework
Dias, Tiago, Batista, Arthur, Maia, Eva, Praça, Isabel
Software systems have become an integral part of modern-day living. Software usage has increased significantly, leading to its growth in both size and complexity. Consequently, software development is becoming a more time-consuming process. In an attempt to accelerate the development cycle, the testing phase is often neglected, leading to the deployment of flawed systems that can have significant implications for users' daily activities. This work presents TestLab, an intelligent automated software testing framework that attempts to gather a set of testing methods and automate them using Artificial Intelligence to allow continuous testing of software systems at multiple levels from different scopes, ranging from developers to end-users. The tool consists of three modules, each serving a distinct purpose. The first two modules aim to identify vulnerabilities from different perspectives, while the third module enhances traditional automated software testing by automatically generating test cases through source code analysis.
- North America > United States (0.46)
- North America > Canada (0.04)
- Europe > Portugal > Porto > Porto (0.04)
A Hierarchical Deep Neural Network for Detecting Lines of Codes with Vulnerabilities
Software vulnerabilities, caused by unintentional flaws in source code, are the main root cause of cyberattacks. Static analysis of source code has been used extensively to detect the unintentional defects, i.e. vulnerabilities, introduced into source code by software developers. In this paper, we propose a deep learning approach to detect vulnerabilities from their LLVM IR representations, based on techniques that have been used in natural language processing. The proposed approach uses a hierarchical process: it first identifies source code files with vulnerabilities, and then identifies the lines of code that contribute to the vulnerability within the detected files. This two-step approach reduces false alarms when detecting vulnerable lines. Our extensive experiment on real-world and synthetic code collected from NVD and SARD shows high accuracy (about 98%) in detecting source code vulnerabilities.
- North America > United States > Florida > Escambia County > Pensacola (0.04)
- Europe > Portugal > Coimbra > Coimbra (0.04)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (0.34)
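The two-step idea in the abstract above (screen whole files first, then score individual lines only in the files that were flagged) can be sketched as follows. This is an illustration under loud assumptions: the paper uses deep neural networks over LLVM IR, while the hypothetical `file_score` and `line_score` functions here are toy keyword heuristics over C source text, chosen only to make the control flow runnable.

```python
from typing import Callable, Dict, List


def hierarchical_detect(
    files: Dict[str, List[str]],
    file_score: Callable[[List[str]], float],
    line_score: Callable[[str], float],
    file_threshold: float = 0.5,
    line_threshold: float = 0.5,
) -> Dict[str, List[int]]:
    """Return 1-based line numbers flagged in each file that passes stage one."""
    findings: Dict[str, List[int]] = {}
    for name, lines in files.items():
        # Stage 1: skip files the file-level model considers clean.
        if file_score(lines) < file_threshold:
            continue
        # Stage 2: score lines only inside files flagged by stage 1.
        findings[name] = [
            number
            for number, line in enumerate(lines, start=1)
            if line_score(line) >= line_threshold
        ]
    return findings


# Toy stand-ins for the two models (assumptions, not the paper's networks).
def file_score(lines: List[str]) -> float:
    return 1.0 if any("strcpy(" in line for line in lines) else 0.0


def line_score(line: str) -> float:
    # Deliberately noisy: flags any call that touches `buf`.
    return 1.0 if "buf" in line and "(" in line else 0.0


files = {
    "unsafe.c": ["char buf[8];", "strcpy(buf, input);", "return 0;"],
    "safe.c": ["char buf[8];", "strncpy(buf, input, sizeof buf - 1);", "return 0;"],
}
findings = hierarchical_detect(files, file_score, line_score)
```

Run on its own, the noisy line scorer would also flag line 2 of `safe.c`. Because that file never passes the first stage, the false alarm is avoided, which is the effect the abstract attributes to the hierarchy.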
PROFES - CISE 2022
In recent decades, software systems have become pervasive in almost all areas of society, growing in size, complexity, and functionality. This continuous growth demands the study, development, and implementation of new Software Engineering (SE) methodologies and tools (e.g., software analysis and design, software portability, formal verification and validation, software measurement, and software maintenance) to build more reliable software. However, despite the introduction of innovative approaches and paradigms useful in the SE field, their technological transfer on a larger scale has been very gradual and remains limited. This is due to the critical aspects of SE with respect to other well-founded engineering disciplines, since SE is strongly influenced by social aspects (i.e., human knowledge, skills, expertise, and interactions) that are highly context-driven, non-mechanical, and strongly based on context and semantic knowledge. The human factor characterizes many of the problems associated with SE, including those observed in development effort estimation, software quality and reliability prediction, software design, and software testing.
"Project smells" -- Experiences in Analysing the Software Quality of ML Projects with mllint
van Oort, Bart, Cruz, Luís, Loni, Babak, van Deursen, Arie
Machine Learning (ML) projects incur novel challenges in their development and productionisation over traditional software applications, though established principles and best practices in ensuring the project's software quality still apply. While using static analysis to catch code smells has been shown to improve software quality attributes, it is only a small piece of the software quality puzzle, especially in the case of ML projects, given their additional challenges and the lower degree of Software Engineering (SE) experience among the data scientists who develop them. We introduce the novel concept of project smells, which consider deficits in project management as a more holistic perspective on software quality in ML projects. An open-source static analysis tool, mllint, was also implemented to help detect and mitigate these. Our research evaluates this novel concept of project smells in the industrial context of ING, a global bank and large software- and data-intensive organisation. We also investigate the perceived importance of these project smells for proof-of-concept versus production-ready ML projects, as well as the perceived obstructions and benefits to using static analysis tools such as mllint. Our findings indicate a need for context-aware static analysis tools that fit the needs of the project at its current stage of development, while requiring minimal configuration effort from the user.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
- Europe > Netherlands > South Holland > Delft (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > Promising Solution (0.68)
- Research Report > New Finding (0.48)
- Banking & Finance (0.66)
- Information Technology (0.46)
Automation: An Essential Component Of Ethical AI?
Nallur, Vivek, Lloyd, Martin, Pearson, Siani
Ethics is sometimes considered to be too abstract to be meaningfully implemented in artificial intelligence (AI). In this paper, we reflect on other aspects of computing that were previously considered to be very abstract. Yet, these are now accepted as being done very well by computers. These tasks have ranged from multiple aspects of software engineering to mathematics to conversation in natural language with humans. This was done by automating the simplest possible step and then building on it to perform more complex tasks. We wonder if ethical AI might be similarly achieved and advocate the process of automation as a key step in making AI take ethical decisions. The key contribution of this paper is to reflect on how automation was introduced into domains previously considered too abstract for computers.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Europe > Germany (0.04)
When programming became a chore
A conversation I had with one of my teachers in ninth grade still comes back to me occasionally. We were discussing how to interact with people, and how to do so without coming off as an arrogant asshole. This was certainly an issue that warranted discussing. She knew I liked programming, and suggested approaching it as a problem that needed solving. At some point, she suggested the business jargon of "solutions", having heard that from somebody in the tech industry.
- Information Technology (0.68)
- Education > Educational Setting > K-12 Education > Secondary School (0.54)